Error Resilience Evaluation on GPGPU Applications

Author

  • Bo Fang
Abstract

While graphics processing units (GPUs) have gained wide adoption as accelerators for general-purpose applications (GPGPU), the end-to-end reliability implications of their use have not been quantified. Fault injection is a widely used method for evaluating the reliability of applications. However, building a fault injector for GPGPU applications is challenging because their massive parallelism makes it difficult to achieve representative results in a time-efficient way. This thesis makes three key contributions. First, it presents the design of a fault-injection methodology to evaluate the end-to-end reliability properties of application kernels running on GPUs. Second, it introduces a fault-injection tool that uses real GPU hardware and offers a good balance between the representativeness and the efficiency of the fault-injection experiments. Third, it characterizes the error resilience of twelve GPGPU applications. In addition, this thesis provides preliminary insights on correlations between algorithm properties and the measured silent data corruption rates of applications.
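The general shape of a fault-injection experiment can be illustrated in plain software. The sketch below is not the thesis's tool, which injects faults on real GPU hardware. It is a minimal CPU-side illustration under stated assumptions: a single-bit-flip fault model, a toy kernel (maximum of squares), and hypothetical helper names `flip_bit`, `kernel`, and `run_campaign`. Each faulty run is compared against a fault-free "golden" run and classified as masked or as a silent data corruption (SDC).

```python
import random
import struct


def flip_bit(value: float, bit: int) -> float:
    """Flip one bit (0..31) of the IEEE-754 single-precision encoding of value."""
    (raw,) = struct.unpack("<I", struct.pack("<f", value))
    (flipped,) = struct.unpack("<f", struct.pack("<I", raw ^ (1 << bit)))
    return flipped


def kernel(data, fault=None):
    """Toy kernel (assumption): return the largest square in data.

    fault is an optional (index, bit) pair; the intermediate value computed
    for that index has the given bit flipped before it is used.
    """
    best = 0.0
    for i, x in enumerate(data):
        y = x * x
        if fault is not None and fault[0] == i:
            y = flip_bit(y, fault[1])
        if y > best:
            best = y
    return best


def run_campaign(data, trials, seed=0):
    """Inject one random single-bit fault per trial and classify the outcome."""
    rng = random.Random(seed)
    golden = kernel(data)  # fault-free reference output
    counts = {"masked": 0, "sdc": 0}
    for _ in range(trials):
        fault = (rng.randrange(len(data)), rng.randrange(32))
        outcome = "masked" if kernel(data, fault) == golden else "sdc"
        counts[outcome] += 1
    return counts


if __name__ == "__main__":
    data = [float(v) for v in range(1, 9)]
    counts = run_campaign(data, trials=1000)
    print(counts, "SDC rate:", counts["sdc"] / 1000)
```

Only two outcome classes appear here because this toy kernel cannot crash or hang; an experiment on real hardware would also need to record those outcomes. The SDC rate is the fraction of injections whose output silently differs from the golden run.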

Similar resources

Evaluating the Error Resilience of GPGPU Applications

Over the past years, GPUs (Graphics Processing Units) have gained wide adoption as accelerators for general purpose computing. A number of studies [1, 2] have shown that significant performance gains can be achieved by deploying GPUs on traditional high performance computing (HPC) systems that host demanding scientific applications. However, the reliability implications of using GPUs are unclea...

Fault injection on GPGPU application

Today, with advances in GPU architectures, hardware, and software support, it has become clear that compute-intensive workloads can be ported to GPU devices. Applications can exploit the parallelism of GPUs to gain significant speedups compared to CPU architectures. However, failures are still unavoidable. People have alrea...

Towards Building Error Resilient GPGPU Applications

GPUs (Graphics Processing Units) have gained wide adoption as accelerators for general purpose computing. They are widely used in error-sensitive applications, i.e. General Purpose GPU (GPGPU) applications. However, the reliability implications of using GPUs are unclear. This paper presents a fault injection study to investigate the end-to-end reliability characteristics of GPGPU applications. T...

CrystalGPU: Transparent and Efficient Utilization of GPU Power

General-purpose computing on graphics processing units (GPGPU) has recently gained considerable attention in various domains such as bioinformatics, databases and distributed computing. GPGPU is based on using the GPU as a co-processor accelerator to offload computationally-intensive tasks from the CPU. This study starts from the observation that a number of GPU features (such as overlapping co...

Evaluation of Psychometric Properties of Walsh Family Resilience Questionnaire

Background: Considering the importance of family resilience and the broader range of applications that focus on the resilience of families, in the current study, the introduction of resilience structure of the family has been identified as an essential research demand. Therefore, consideration of the psychometric properties of the most widely used tools in this area, including family resilience...

Publication date: 2014